The CT Department of Public Health releases data on select birth outcomes (birthweight, gestational age) for Connecticut resident births, along with select demographic, behavioral, and socioeconomic characteristics of the mother.

CTdata.org provides these data sets as 1-, 3-, and 5-year averages because some combinations of characteristics have little data at the 1-year level. Depending on which combination of characteristics the user is interested in, data may therefore not be available at the 1- or 3-year level, and only at the 5-year level.

But how can the user know at which level to start exploring?

That’s where these tables come in.

The tables found in this repository are meant to guide the user to the earliest level of aggregation in which data are available for a given town and combination of characteristics.

For example, are you looking for how many births took place in Hartford to mothers aged 45 years and older who carried to full term?

You’ll first have to look at the 3-year aggregation data set for that information, because no data are available in the 1-year data set. These are the types of questions that can be answered with these tables.

Please note that these tables reflect availabilities for the latest years available. If no data are available for a given combination of characteristics, the Available column will be blank.
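As a rough illustration of how such a lookup might work once a table is loaded, the sketch below filters a small hand-built data frame for the Hartford question above. The data frame `lookup` and its rows are made up for illustration, and the category labels are assumptions about how the categories are spelled in the real tables.

```r
# Hypothetical stand-in for one of the search tables; the rows are invented
# for illustration and the category labels are assumed spellings.
lookup <- data.frame(
  Town = c("Hartford", "Hartford"),
  Birth.Weight = c("All", "All"),
  Gestational.Age = c("37 weeks or more", "37 weeks or more"),
  Mother.s.Age = c("45 years and older", "0 to 14 years"),
  Mother.s.Race.Ethnicity = c("All", "All"),
  Earliest.Data.Available = c("2012-2014", NA),
  stringsAsFactors = FALSE
)

# Keep the row matching the town and characteristics of interest
hit <- lookup[lookup$Town == "Hartford" &
              lookup$Gestational.Age == "37 weeks or more" &
              lookup$Mother.s.Age == "45 years and older", ]

# The earliest aggregation with data; NA would mean none is available
hit$Earliest.Data.Available
```

A blank (`NA`) in the last column, as in the second made-up row, would mean that no aggregation level has data for that combination.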

The following code example shows the step-by-step process for creating the search tables, using the Demographics data sets:

Bring in individual aggregations

```r
library(dplyr)  # provides %>%, select(), arrange(), group_by() and mutate() used below

# path_to_data must already point to the folder holding the three CSV files
demo_1 <- read.csv(file.path(path_to_data, "maternal_characteristics_demographics_1-Year.csv"), stringsAsFactors = FALSE, header = TRUE)
demo_3 <- read.csv(file.path(path_to_data, "maternal_characteristics_demographics_3-Year.csv"), stringsAsFactors = FALSE, header = TRUE)
demo_5 <- read.csv(file.path(path_to_data, "maternal_characteristics_demographics_5-Year.csv"), stringsAsFactors = FALSE, header = TRUE)
```

Combine all aggregations

```r
# Stack the 1-, 3- and 5-year files into one data frame
demo_all <- rbind(demo_1, demo_3, demo_5)
```
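`rbind()` on data frames only succeeds when the inputs have matching column names, so a quick check before combining can save some confusion. A minimal sketch using small made-up data frames in place of the real files (`a1`, `a3`, and `a5` are hypothetical):

```r
# Hypothetical miniature versions of the three aggregation files
a1 <- data.frame(Town = "Andover", Year = "2014", Value = 0, stringsAsFactors = FALSE)
a3 <- data.frame(Town = "Andover", Year = "2012-2014", Value = 0, stringsAsFactors = FALSE)
a5 <- data.frame(Town = "Andover", Year = "2010-2014", Value = 1, stringsAsFactors = FALSE)

# Confirm all inputs share the same columns, in the same order, before stacking
same_cols <- identical(names(a1), names(a3)) && identical(names(a1), names(a5))
stopifnot(same_cols)

combined <- rbind(a1, a3, a5)
nrow(combined)  # one row per input row
```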

Isolate to latest years and numbers only (no percent values)

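```r
# Latest 1-, 3- and 5-year periods, ordered from finest to coarsest aggregation
years <- c("2014", "2012-2014", "2010-2014")
demo_select <- demo_all[demo_all$Measure.Type == "Number" & demo_all$Year %in% years, ]
# Order the factor levels so that the 1-year period sorts first
demo_select$Year <- factor(demo_select$Year, levels = years)
```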
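The order of the factor levels matters for the later steps: R stores a factor as integer codes that follow the level order, so the 1-year period becomes 1, the 3-year period 2, and the 5-year period 3. A small sketch of that behaviour (the vector `yr` is made up for illustration):

```r
years <- c("2014", "2012-2014", "2010-2014")

# A made-up vector of periods in arbitrary order
yr <- factor(c("2010-2014", "2014", "2012-2014"), levels = years)

# Integer codes follow the order given in `levels`, not alphabetical order
as.integer(yr)  # 3 1 2
```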

Create column that checks to see if data is available for a given year

```r
demo_select2 <- demo_select %>%
  select(-c(Measure.Type, FIPS)) %>%
  arrange(Town, Birth.Weight, Gestational.Age, Mother.s.Age, Mother.s.Race.Ethnicity) %>%
  group_by(Town, Birth.Weight, Gestational.Age, Mother.s.Age, Mother.s.Race.Ethnicity, Year) %>%
  # 1 if the row has a positive count, 0 otherwise
  mutate(value_avail = ifelse(Value > 0, 1, 0))
```

Based on availability column, finds groups that have data vs groups with no data

```r
demo_select3 <- demo_select2 %>%
  group_by(Town, Birth.Weight, Gestational.Age, Mother.s.Age, Mother.s.Race.Ethnicity) %>%
  # 1 if any of the three periods has data for this combination, 0 otherwise
  mutate(max_value_avail = max(value_avail))
```

Conditionally assign year_avail: the level code of Year (1, 2, or 3) where data are available, NA otherwise

```r
demo_select4 <- demo_select3 %>%
  # Rows without data stay NA; rows with data get the integer code of their period
  mutate(year_avail = ifelse(value_avail == 1 & max_value_avail == 1, as.integer(Year), NA_integer_))
```
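`ifelse()` does not preserve factors: a factor passed as the `yes` value comes back as integer level codes rather than labels, which is why `year_avail` holds the numbers 1 to 3 instead of year labels. A small sketch with made-up availability flags:

```r
years <- c("2014", "2012-2014", "2010-2014")
Year <- factor(years, levels = years)

# Made-up availability flags for the three periods
value_avail <- c(0, 0, 1)

# The factor attributes are dropped, leaving the level code where the
# flag is 1 and NA elsewhere
year_avail <- ifelse(value_avail == 1, Year, NA)
year_avail
```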

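Assign 4 to all NAs in year_avail (so they sort after the real period codes)

```r
# Only year_avail is touched, so NAs in other columns are left as they are
demo_select4$year_avail[is.na(demo_select4$year_avail)] <- 4
```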

Create column that reflects min year in which data is available

```r
demo_select5 <- demo_select4 %>%
  group_by(Town, Birth.Weight, Gestational.Age, Mother.s.Age, Mother.s.Race.Ethnicity) %>%
  # Smallest code in the group: 1, 2 or 3 for the finest period with data, 4 if none
  mutate(min_year = min(year_avail))
```

Set up for assignments

```r
# Convert the grouped tibble back to a plain data frame
demo_select5 <- as.data.frame(demo_select5)
```

Assign year available based on value of min year column

```r
demo_select6 <- demo_select5 %>%
  # Map the period code back to its label; code 4 (no data) becomes NA
  mutate(Earliest.Data.Available = ifelse(min_year == 1, "2014",
                                   ifelse(min_year == 2, "2012-2014",
                                   ifelse(min_year == 3, "2010-2014", NA))))
```

Remove extra columns

```r
demo_available <- demo_select6 %>%
  # Drop the per-period and intermediate columns, keeping the characteristics and the result
  select(-c(Year, Variable, Value, value_avail, max_value_avail, year_avail, min_year))
```

Remove duplicates (isolate latest year available)

```r
# The three period rows per combination are now identical, so keep one of each
demo_available <- demo_available[!duplicated(demo_available), ]
```
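To see the whole sequence at work, the sketch below runs a condensed version of the same logic on a few made-up rows. It is not the code used to build the published tables: it collapses the intermediate steps into one grouped summary, and the data frame `toy` is invented for illustration.

```r
library(dplyr)

years <- c("2014", "2012-2014", "2010-2014")

# Made-up rows: one age group with no data at all, one with data only at 5 years
toy <- data.frame(
  Town = "Andover",
  Mother.s.Age = rep(c("0 to 14 years", "15 to 19 years"), each = 3),
  Year = factor(rep(years, times = 2), levels = years),
  Value = c(0, 0, 0, 0, 0, 1),
  stringsAsFactors = FALSE
)

toy_available <- toy %>%
  group_by(Town, Mother.s.Age) %>%
  summarise(
    # Smallest level code among periods with a positive count; 4 if none
    min_year = min(ifelse(Value > 0, as.integer(Year), 4L)),
    .groups = "drop"
  ) %>%
  # Codes 1 to 3 map back to period labels; code 4 has no label and yields NA
  mutate(Earliest.Data.Available = years[min_year]) %>%
  select(-min_year)

toy_available
```

Summarising to one row per combination also makes the separate de-duplication step unnecessary in this condensed form.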

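Take a peek at the table with each step represented in a column

```r
# demo_select6_print is not defined in the steps above; it is assumed here
# to be demo_select6, which still holds every intermediate column
demo_select6_print <- demo_select6
table <- head(demo_select6_print)
knitr::kable(table)
```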
| Town | Year | Birth.Weight | Gestational.Age | Mother.s.Age | Mother.s.Race.Ethnicity | Variable | Value | value_avail | max_value_avail | year_avail | min_year | Earliest.Data.Available |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Andover | 2014 | All | 37 weeks or more | 0 to 14 years | All | Births | 0 | 0 | 0 | 4 | 4 | NA |
| Andover | 2012-2014 | All | 37 weeks or more | 0 to 14 years | All | Births | 0 | 0 | 0 | 4 | 4 | NA |
| Andover | 2010-2014 | All | 37 weeks or more | 0 to 14 years | All | Births | 0 | 0 | 0 | 4 | 4 | NA |
| Andover | 2014 | All | 37 weeks or more | 15 to 19 years | All | Births | 0 | 0 | 1 | 4 | 3 | 2010-2014 |
| Andover | 2012-2014 | All | 37 weeks or more | 15 to 19 years | All | Births | 0 | 0 | 1 | 4 | 3 | 2010-2014 |
| Andover | 2010-2014 | All | 37 weeks or more | 15 to 19 years | All | Births | 1 | 1 | 1 | 3 | 3 | 2010-2014 |

Finished, searchable product

Some Observations

The overall annual trend dating back to 1999 shows that the total number of births occurring in CT has declined from just over 86,000 in 1999 to just over 72,000 in 2014.
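To put that decline in relative terms, a quick back-of-the-envelope calculation helps. The sketch below uses the rounded figures quoted above rather than the exact counts, so the result is only approximate:

```r
# Approximate totals quoted in the text ("just over" each figure)
births_1999 <- 86000
births_2014 <- 72000

# Relative change from 1999 to 2014, as a percentage
pct_change <- (births_2014 - births_1999) / births_1999 * 100
round(pct_change, 1)
```

With these rounded inputs the drop works out to roughly 16 percent.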

When we break down those totals into age groups, we see a decline in teen births and an increase in births to mothers aged 45+.

When we break down those totals into education levels, we see an increase over time in the number of mothers who have received 17 years or more of education, while the numbers of mothers with high school, GED, or college level education have all decreased.

When we break down those totals into groups identified by when the mothers first initiated prenatal care, we see the majority of mothers initiating prenatal care within the first trimester. We’ve also broken out the total number of mothers considered in the “Late” category. Late prenatal care is the initiation of prenatal care after the first trimester of pregnancy.

Now let’s call out some interesting facts about towns across the state.

Each town overall